Dear Statalist users,
We are trying to estimate the following regression model, where the unit of observation is a child:

where I{ } is an indicator function that is equal to 1 if the statement inside the curly brackets is true, and 0 otherwise.
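In case the equation does not display, one way to write the model described here is sketched below. The notation is an assumption made for illustration: y_i is the child's outcome, J is the number of grandparent couples, \theta_j is the effect of couple j, and \varepsilon_i is the error term. Any additional covariates are omitted.

```latex
y_i = \alpha + \sum_{j=1}^{J} \theta_j \, I\{\text{couple } j \text{ is a grandparent couple of child } i\} + \varepsilon_i
```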
As an example, suppose that we have four grandparent couples (denoted GP1, GP2, GP3, and GP4 in the figure below) and three grandchildren (denoted gc1, gc2, and gc3 in the figure below).

where paternal grandparents are connected to their grandchild with a solid line and maternal grandparents are connected to their grandchild with a dashed line.
Conceptually, there is nothing complicated about this model. Each observation (child) has exactly two dummy variables equal to 1 (one for the paternal grandparent couple and one for the maternal grandparent couple), and all the other dummy variables equal to 0. Continuing with the example above, the table below shows which dummy variables would be equal to 1 for each grandchild (the unit of observation in the above model):

However, we find it difficult to estimate this model with our sample: there are millions of grandparent couples associated with it, so creating one dummy variable for each couple is not computationally feasible.
An alternative would be to assign each grandparent couple a numeric id, create two variables (paternal grandparents' id and maternal grandparents' id), and include these two variables as fixed effects in reghdfe.
However, we run into the following issue: the maternal grandparents of some children in our sample are the paternal grandparents of other children in our sample.
Continuing with the example above, we can assign to each grandchild the values of his/her paternal grandparents' id and maternal grandparents' id, as follows:

In this example, if we use the ids of paternal and maternal grandparents as separate fixed effects (using, e.g., reghdfe), we cannot let Stata know that "2" in the paternal-grandparents fixed-effects variable refers to the same grandparent couple as "2" in the maternal-grandparents fixed-effects variable. The two variables are treated as two unrelated sets of fixed effects, so that couple would get two separate effects instead of one.
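To make the structure we are after concrete, here is a minimal sketch in Python (not Stata) of the design matrix we would like to use. Every couple id maps to a single column, whether it appears on the paternal or the maternal side. The links between grandchildren and couples are hypothetical values chosen for illustration, and the function names are ours, not part of any package.

```python
# Hypothetical links between grandchildren and grandparent couples,
# chosen only for illustration (couple 2 is maternal for gc1 and paternal for gc2).
children = {
    "gc1": {"paternal": 1, "maternal": 2},
    "gc2": {"paternal": 2, "maternal": 3},
    "gc3": {"paternal": 4, "maternal": 3},
}


def build_sparse_rows(children):
    """Give every couple id ONE column, regardless of which side it appears on.

    Returns the id -> column mapping and, for each child, the sorted list of
    column indices whose dummy equals 1 (a sparse row of the design matrix).
    """
    couple_ids = sorted({cid for links in children.values() for cid in links.values()})
    column_of = {cid: col for col, cid in enumerate(couple_ids)}
    rows = {
        child: sorted({column_of[links["paternal"]], column_of[links["maternal"]]})
        for child, links in children.items()
    }
    return column_of, rows


def to_dense(rows, n_columns):
    """Expand the sparse rows into 0/1 lists (only sensible for tiny examples)."""
    return {
        child: [1 if col in cols else 0 for col in range(n_columns)]
        for child, cols in rows.items()
    }


column_of, rows = build_sparse_rows(children)
dense = to_dense(rows, len(column_of))
for child, indicators in dense.items():
    print(child, indicators)
```

With millions of couples only the sparse rows (two column indices per child) would be stored. The dense expansion is shown purely to make the shared column for couple 2 visible.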
We are wondering if anyone knows any Stata command that can handle this case. Any help would be greatly appreciated.
Thank you!
Sam